AITopics

2509.01886

Country: Asia (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.66)

Industry:

Transportation > Infrastructure & Services (0.89)
Transportation > Ground > Road (0.89)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Sudjianto, Agus, Burakov, Denis

An Information-Theoretic Framework for Credit Risk Modeling: Unifying Industry Practice with Statistical Theory for Fair and Interpretable Scorecards

arXiv.org Machine LearningSep-15-2025

Credit risk modeling relies extensively on Weight of Evidence (WoE) and Information Value (IV) for feature engineering, and Population Stability Index (PSI) for drift monitoring, yet their theoretical foundations remain disconnected. We establish a unified information-theoretic framework revealing these industry-standard metrics as instances of classical information divergences. Specifically, we prove that IV exactly equals PSI (Jeffreys divergence) computed between good and bad credit outcomes over identical bins. Through the delta method applied to WoE transformations, we derive standard errors for IV and PSI, enabling formal hypothesis testing and probabilistic fairness constraints for the first time. We formalize credit modeling's inherent performance-fairness trade-off as maximizing IV for predictive power while minimizing IV for protected attributes. Using automated binning with depth-1 XGBoost stumps, we compare three encoding strategies: logistic regression with one-hot encoding, WoE transformation, and constrained XGBoost. All methods achieve comparable predictive performance (AUC 0.82-0.84), demonstrating that principled, information-theoretic binning outweighs encoding choice. Mixed-integer programming traces Pareto-efficient solutions along the performance-fairness frontier with uncertainty quantification. This framework bridges theory and practice, providing the first rigorous statistical foundation for widely-used credit risk metrics while offering principled tools for balancing accuracy and fairness in regulated environments.

application, divergence, information value, (15 more...)

2509.09855

Country:

North America > United States > North Carolina > Mecklenburg County > Charlotte (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry:

Banking & Finance > Credit (1.00)
Banking & Finance > Risk Management (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

arXiv.org Machine LearningMay-26-2025

Asymptotically optimal regret in communicating Markov decision processes

Boone, Victor

In this paper, we present a learning algorithm that achieves asymptotically optimal regret for Markov decision processes in average reward under a communicating assumption. That is, given a communicating Markov decision process $M$, our algorithm has regret $K(M) \log(T) + \mathrm{o}(\log(T))$ where $T$ is the number of learning steps and $K(M)$ is the best possible constant. This algorithm works by explicitly tracking the constant $K(M)$ to learn optimally, then balances the trade-off between exploration (playing sub-optimally to gain information), co-exploration (playing optimally to gain information) and exploitation (playing optimally to score maximally). We further show that the function $K(M)$ is discontinuous, which is a consequence challenge for our approach. To that end, we describe a regularization mechanism to estimate $K(M)$ with arbitrary precision from empirical data.

artificial intelligence, machine learning, markov decision process, (15 more...)

2505.18064

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceFeb-9-2025

The Value of Information in Human-AI Decision-making

Guo, Ziyang, Wu, Yifan, Hartline, Jason, Hullman, Jessica

As the performance of artificial intelligence (AI) models improves, workflows in which human and AI model-based judgments are combined to make decisions are sought in medicine, finance, and other domains. Though statistical models often make more accurate predictions than human experts on average [Ægisdóttir et al., 2006, Grove et al., 2000, Meehl, 1954], whenever humans have access to additional information over the AI, there is potential to achieve complementary performance by pairing the two, i.e., better performance than either the human or AI alone. For example, a physician may have access to additional information that may not be captured in tabular electronic health records or other structured data [Alur et al., 2024b]. However, evidence of complementary performance between humans and AI is limited, with many studies showing that human-AI teams underperform an AI alone [Buçinca et al., 2020, Bussone et al., 2015, Green and Chen, 2019, Jacobs et al., 2021, Lai and Tan, 2019, Vaccaro and Waldo, 2019, Kononenko, 2001]. A solid understanding of such results is limited by the fact that most analyses of human-AI decision-making focus on ranking the performance of human-AI teams or each individually using measures like posthoc decision accuracy.

information, machine learning, natural language, (20 more...)

2502.06152

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.68)
Health & Medicine > Health Care Technology > Medical Record (0.54)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
(2 more...)

arXiv.org Artificial IntelligenceDec-6-2024

More than Marketing? On the Information Value of AI Benchmarks for Practitioners

Hardy, Amelia, Reuel, Anka, Meimandi, Kiana Jafari, Soder, Lisa, Griffith, Allie, Asmar, Dylan M., Koyejo, Sanmi, Bernstein, Michael S., Kochenderfer, Mykel J.

Public AI benchmark results are widely broadcast by model developers as indicators of model quality within a growing and competitive market. However, these advertised scores do not necessarily reflect the traits of interest to those who will ultimately apply AI models. In this paper, we seek to understand if and how AI benchmarks are used to inform decision-making. Based on the analyses of interviews with 19 individuals who have used, or decided against using, benchmarks in their day-to-day work, we find that across these settings, participants use benchmarks as a signal of relative performance difference between models. However, whether this signal was considered a definitive sign of model superiority, sufficient for downstream decisions, varied. In academia, public benchmarks were generally viewed as suitable measures for capturing research progress. By contrast, in both product and policy, benchmarks -- even those developed internally for specific tasks -- were often found to be inadequate for informing substantive decisions. Of the benchmarks deemed unsatisfactory, respondents reported that their goals were neither well-defined nor reflective of real-world use. Based on the study results, we conclude that effective benchmarks should provide meaningful, real-world evaluations, incorporate domain expertise, and maintain transparency in scope and goals. They must capture diverse, task-relevant capabilities, be challenging enough to avoid quick saturation, and account for trade-offs in model performance rather than relying on a single score. Additionally, proprietary data collection and contamination prevention are critical for producing reliable and actionable results. By adhering to these criteria, benchmarks can move beyond mere marketing tricks into robust evaluative frameworks.

benchmark, large language model, machine learning, (19 more...)

2412.0552

Country: North America > United States > California > Santa Clara County (0.15)

Genre:

Questionnaire & Opinion Survey (1.00)
Personal > Interview (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Applied AI (0.68)
(2 more...)

arXiv.org Artificial IntelligenceNov-2-2024

Unexploited Information Value in Human-AI Collaboration

Guo, Ziyang, Wu, Yifan, Hartline, Jason, Hullman, Jessica

Humans and AIs are often paired on decision tasks with the expectation of achieving complementary performance -- where the combination of human and AI outperforms either one alone. However, how to improve performance of a human-AI team is often not clear without knowing more about what particular information and strategies each agent employs. In this paper, we propose a model based in statistically decision theory to analyze human-AI collaboration from the perspective of what information could be used to improve a human or AI decision. We demonstrate our model on a deepfake detection task to investigate seven video-level features by their unexploited value of information. We compare the human alone, AI alone and human-AI team and offer insights on how the AI assistance impacts people's usage of the information and what information that the AI exploits well might be useful for improving human decisions.

artificial intelligence, information, machine learning, (19 more...)

2411.10463

Country:

North America > United States > Illinois > Cook County > Evanston (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
Asia > Malaysia (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Meister, Clara, Giulianelli, Mario, Pimentel, Tiago

Towards a Similarity-adjusted Surprisal Theory

arXiv.org Artificial IntelligenceOct-23-2024

Surprisal theory posits that the cognitive effort required to comprehend a word is determined by its contextual predictability, quantified as surprisal. Traditionally, surprisal theory treats words as distinct entities, overlooking any potential similarity between them. Giulianelli et al. (2023) address this limitation by introducing information value, a measure of predictability designed to account for similarities between communicative units. Our work leverages Ricotta and Szeidl's (2006) diversity index to extend surprisal into a metric that we term similarity-adjusted surprisal, exposing a mathematical relationship between surprisal and information value. Similarity-adjusted surprisal aligns with information value when considering graded similarities and reduces to standard surprisal when words are treated as distinct. Experimental results with reading time data indicate that similarity-adjusted surprisal adds predictive power beyond standard surprisal for certain datasets, suggesting it serves as a complementary measure of comprehension effort.

artificial intelligence, machine learning, natural language, (19 more...)

2410.17676

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(6 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.95)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Surjanovic, Nikola, Henrey, Andrew, Loughin, Thomas M.

Alpha-Trimming: Locally Adaptive Tree Pruning for Random Forests

arXiv.org Machine LearningAug-13-2024

We demonstrate that adaptively controlling the size of individual regression trees in a random forest can improve predictive performance, contrary to the conventional wisdom that trees should be fully grown. A fast pruning algorithm, alpha-trimming, is proposed as an effective approach to pruning trees within a random forest, where more aggressive pruning is performed in regions with a low signal-to-noise ratio. The amount of overall pruning is controlled by adjusting the weight on an information criterion penalty as a tuning parameter, with the standard random forest being a special case of our alpha-trimmed random forest. A remarkable feature of alpha-trimming is that its tuning parameter can be adjusted without refitting the trees in the random forest once the trees have been fully grown once. In a benchmark suite of 46 example data sets, mean squared prediction error is often substantially lowered by using our pruning algorithm and is never substantially increased compared to a random forest with fully-grown trees at default parameter settings.

algorithm, random forest, regression tree, (13 more...)

2408.07151

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Giulianelli, Mario, Wallbridge, Sarenne, Fernández, Raquel

Information Value: Measuring Utterance Predictability as Distance from Plausible Alternatives

arXiv.org Artificial IntelligenceOct-20-2023

Giulianelli and Fernández, 2021; Wallbridge When viewed as information transmission, successful et al., 2022). However, token-level autoregressive language production can be seen as an act approximations of utterance probability have a of reducing the uncertainty over future states that a few problematic properties. A well-known issue comprehender may be anticipating. Saying a word, is that different realisations of the same concept for example, may cut the space of possibilities in or communicative intent compete for probability half, while uttering a whole sentence may restrict mass (Holtzman et al., 2021), which implies that the comprehender's expectations to a far smaller the surprisal of semantically equivalent realisations space. Measuring the amount of information is overestimated. Moreover, token-level carried by a linguistic signal is fundamental to surprisal estimates conflate different dimensions of the computational modelling of human language predictability. As evidenced by recent studies (Arehalli processing. Such quantifications are used in et al., 2022; Kuhn et al., 2023), this makes psycholinguistic and neurobiological models of it difficult to appreciate whether the information language processing (Levy, 2008; Willems et al., carried by an utterance is a result, for example, of 2016; Futrell and Levy, 2017; Armeni et al., 2017), the unexpectedness of its lexical material, syntactic to study the processing mechanisms of neural arrangements, semantic content, or speech act type.

information value, reading time, surprisal, (14 more...)

2310.13676

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Dominican Republic (0.04)
(12 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Rojas, Helder, Alvarez, Cirilo, Rojas, Nilton

Statistical Hypothesis Testing for Information Value (IV)

arXiv.org Machine LearningSep-29-2023

Information value (IV) is a quite popular technique for features selection before the modeling phase. There are practical criteria, based on fixed thresholds for IV, but at the same time mysterious and lacking theoretical arguments, to decide if a predictor has sufficient predictive power to be considered in the modeling phase. However, the mathematical development and statistical inference methods for this technique are almost nonexistent in the literature. In this paper we present a theoretical framework for IV, and at the same time, we propose a non-parametric hypothesis test to evaluate the predictive power of features contemplated in a data set. Due to its relationship with divergence measures developed in the Information Theory, we call our proposal the J - Divergence test. We show how to efficiently compute our test statistic and we study its performance on simulated data. In various scenarios, particularly in unbalanced data sets, we show its superiority over conventional criteria based on fixed thresholds. Furthermore, we apply our test on fraud identification data and provide an open-source Python library, called "statistical-iv"(https://pypi.org/project/statistical-iv/), where we implement our main results.

artificial intelligence, imbalance, machine learning, (17 more...)

2309.13183

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.05)
North America > United States > North Carolina (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Genre: Research Report (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.41)